Classifier Adaptation with Non-representative Training Data
Internal identifier: 001927 (Main/Exploration); previous: 001926; next: 001928
Authors: Sriharsha Veeramachaneni [United States]; George Nagy (computer scientist) [United States]
Source:
- Lecture Notes in Computer Science [0302-9743]; 2002.
Abstract
We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-print to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating the mean and imposing style constraints reduce the error rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test set is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.
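The idea of adapting only the class means of a Gaussian quadratic classifier to unlabeled test data can be illustrated with a small sketch. This is not the authors' implementation: it assumes equal class priors, uses synthetic 2-D data in place of digit features, and re-estimates the means with hard (decision-directed) assignments, which is only one possible way to do unsupervised adaptation. All function names and parameters below are hypothetical.

```python
import numpy as np

def make_data(rng, centers, n_per_class, std, shift=0.0):
    """Draw isotropic Gaussian samples around each (optionally shifted) center."""
    X = np.concatenate([rng.normal(c + shift, std, size=(n_per_class, len(c)))
                        for c in centers])
    y = np.repeat(np.arange(len(centers)), n_per_class)
    return X, y

def fit_gaussians(X, y, n_classes, reg=1e-3):
    """Estimate per-class means and lightly regularized covariances."""
    d = X.shape[1]
    means = np.stack([X[y == c].mean(axis=0) for c in range(n_classes)])
    covs = np.stack([np.cov(X[y == c], rowvar=False) + reg * np.eye(d)
                     for c in range(n_classes)])
    return means, covs

def quadratic_scores(X, means, covs):
    """Gaussian log-likelihood of each sample per class, up to a constant."""
    scores = np.empty((X.shape[0], means.shape[0]))
    for c, (mu, cov) in enumerate(zip(means, covs)):
        diff = X - mu
        inv = np.linalg.inv(cov)
        _, logdet = np.linalg.slogdet(cov)
        maha = np.einsum("ij,jk,ik->i", diff, inv, diff)
        scores[:, c] = -0.5 * (maha + logdet)
    return scores

def adapt_means(X_test, means, covs, n_iter=10):
    """Re-estimate class means from unlabeled test data; covariances stay fixed."""
    means = means.copy()
    for _ in range(n_iter):
        labels = quadratic_scores(X_test, means, covs).argmax(axis=1)
        for c in range(means.shape[0]):
            members = X_test[labels == c]
            if len(members) > 0:  # keep the old mean if a class gets no samples
                means[c] = members.mean(axis=0)
    return means

rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])

# Training data in one "style"; test data in a style shifted by a constant offset.
X_train, y_train = make_data(rng, centers, 200, std=0.5)
X_test, y_test = make_data(rng, centers, 200, std=0.5, shift=1.0)

means, covs = fit_gaussians(X_train, y_train, n_classes=3)
acc_before = (quadratic_scores(X_test, means, covs).argmax(axis=1) == y_test).mean()

adapted = adapt_means(X_test, means, covs)
acc_after = (quadratic_scores(X_test, adapted, covs).argmax(axis=1) == y_test).mean()

print("accuracy before adaptation:", acc_before)
print("accuracy after adaptation: ", acc_after)
```

In this toy setting the test labels are used only to report accuracy, never for adaptation. With very few test samples per class the re-estimated means become unreliable, which is consistent with the abstract's remark that adaptation needs a sufficiently large test set.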
DOI: 10.1007/3-540-45869-7_17
Links toward previous steps (curation, corpus...)
- to stream Istex, to step Corpus: 000B14
- to stream Istex, to step Curation: 000B01
- to stream Istex, to step Checkpoint: 001041
- to stream Main, to step Merge: 001A07
- to stream Main, to step Curation: 001927
The document in XML format
<record><TEI wicri:istexFullTextTei="biblStruct"><teiHeader><fileDesc><titleStmt><title xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author><name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</author>
<author><name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation><country>États-Unis</country>
<placeName><settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</titleStmt>
<publicationStmt><idno type="wicri:source">ISTEX</idno>
<idno type="RBID">ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<date when="2002" year="2002">2002</date>
<idno type="doi">10.1007/3-540-45869-7_17</idno>
<idno type="url">https://api.istex.fr/document/BA6AC24A377F2F9A6379DAC3467543B5C8B7A845/fulltext/pdf</idno>
<idno type="wicri:Area/Istex/Corpus">000B14</idno>
<idno type="wicri:Area/Istex/Curation">000B01</idno>
<idno type="wicri:Area/Istex/Checkpoint">001041</idno>
<idno type="wicri:doubleKey">0302-9743:2002:Veeramachaneni S:classifier:adaptation:with</idno>
<idno type="wicri:Area/Main/Merge">001A07</idno>
<idno type="wicri:Area/Main/Curation">001927</idno>
<idno type="wicri:Area/Main/Exploration">001927</idno>
</publicationStmt>
<sourceDesc><biblStruct><analytic><title level="a" type="main" xml:lang="en">Classifier Adaptation with Non-representative Training Data</title>
<author><name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName><region type="state">État de New York</region>
</placeName>
</affiliation>
<affiliation wicri:level="1"><country wicri:rule="url">États-Unis</country>
</affiliation>
</author>
<author><name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<affiliation wicri:level="2"><country xml:lang="fr">États-Unis</country>
<wicri:regionArea>Rensselaer Polytechnic Institute, 12180, Troy, NY</wicri:regionArea>
<placeName><region type="state">État de New York</region>
</placeName>
<placeName><settlement type="city">Troy (New York)</settlement>
<region type="state">État de New York</region>
</placeName>
<orgName type="lab" n="5">Institut polytechnique Rensselaer</orgName>
</affiliation>
</author>
</analytic>
<monogr></monogr>
<series><title level="s">Lecture Notes in Computer Science</title>
<imprint><date>2002</date>
</imprint>
<idno type="ISSN">0302-9743</idno>
</series>
<idno type="istex">BA6AC24A377F2F9A6379DAC3467543B5C8B7A845</idno>
<idno type="DOI">10.1007/3-540-45869-7_17</idno>
<idno type="ChapterID">17</idno>
<idno type="ChapterID">Chap17</idno>
</biblStruct>
</sourceDesc>
<seriesStmt><idno type="ISSN">0302-9743</idno>
</seriesStmt>
</fileDesc>
<profileDesc><textClass></textClass>
<langUsage><language ident="en">en</language>
</langUsage>
</profileDesc>
</teiHeader>
<front><div type="abstract" xml:lang="en">Abstract: We propose an adaptive methodology to tune the decision boundaries of a classifier trained on non-representative data to the statistics of the test data to improve accuracy. Specifically, for machine-printed and hand-printed digit recognition we demonstrate that adapting the class means alone can provide considerable gains in recognition. On machine-printed digits we adapt to the typeface, on hand-print to the writer. We recognize the digits with a Gaussian quadratic classifier when the style of the test set is represented by a subset of the training set, and also when it is not represented in the training set. We compare unsupervised adaptation and style-constrained classification on isogenous test sets of five machine-printed and two hand-printed NIST data sets. Both estimating the mean and imposing style constraints reduce the error rate in almost every case, and neither ever results in significant loss. They are comparable under the first scenario (specialization), but adaptation is better under the second (new style). Adaptation is beneficial when the test set is large enough (even if only ten samples of each class by one writer in a 100-dimensional feature space), but style-conscious classification is the only option with fields of only two or three digits.</div>
</front>
</TEI>
<affiliations><list><country><li>États-Unis</li>
</country>
<region><li>État de New York</li>
</region>
<settlement><li>Troy (New York)</li>
</settlement>
<orgName><li>Institut polytechnique Rensselaer</li>
</orgName>
</list>
<tree><country name="États-Unis"><region name="État de New York"><name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</region>
<name sortKey="Nagy, George" sort="Nagy, George" uniqKey="Nagy G" first="George" last="Nagy">George Nagy (informaticien)</name>
<name sortKey="Veeramachaneni, Sriharsha" sort="Veeramachaneni, Sriharsha" uniqKey="Veeramachaneni S" first="Sriharsha" last="Veeramachaneni">Sriharsha Veeramachaneni</name>
</country>
</tree>
</affiliations>
</record>
To manipulate this document under Unix (Dilib)
EXPLOR_STEP=$WICRI_ROOT/Ticri/CIDE/explor/OcrV1/Data/Main/Exploration
HfdSelect -h $EXPLOR_STEP/biblio.hfd -nk 001927 | SxmlIndent | more
Or
HfdSelect -h $EXPLOR_AREA/Data/Main/Exploration/biblio.hfd -nk 001927 | SxmlIndent | more
To link to this page within the Wicri network
{{Explor lien |wiki= Ticri/CIDE |area= OcrV1 |flux= Main |étape= Exploration |type= RBID |clé= ISTEX:BA6AC24A377F2F9A6379DAC3467543B5C8B7A845 |texte= Classifier Adaptation with Non-representative Training Data }}
This area was generated with Dilib version V0.6.32.